Blueprint components and configuration

Blueprints provide a YAML-based framework for defining custom connectors. This structured configuration allows you to connect to REST APIs and build scalable data pipelines without the need for custom scripting.

Blueprint YAML structure

Each Blueprint is composed of the following core components:

Component	Description	Required
`interface_parameters`	User-configurable inputs (authentication, filters, dates)	No
`connector`	API connection settings (base URL, headers, storage)	Yes
`steps`	Workflow logic (REST calls, loops, data extraction)	Yes

Example: Complete structure configuration

# 1. Interface Parameters - User inputs displayed in River configuration
interface_parameters:
  section:
    source:
      - name: "api_credentials"
        type: "authentication"
        auth_type: "bearer"
        fields:
          - name: "bearer_token"
            type: "string"
            is_encrypted: true
      - name: "date_range"
        type: "date_range"
        period_type: "date"
        format: "YYYY-mm-DD"
        fields:
          - name: "start_date"
            value: ""
          - name: "end_date"
            value: ""

# 2. Connector Configuration - API connection settings
connector:
  name: "My API Connector"
  base_url: "https://api.example.com/v1"
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
  default_retry_strategy:
    500:
      max_attempts: 3
      retry_interval: 10
    429:
      max_attempts: 5
      retry_interval: 60
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"

# 3. Steps - Workflow logic
steps:
  - name: "Fetch Data"
    description: "Retrieve data from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/data"
    query_params:
      start_date: "{date_range.start_date}"
      end_date: "{date_range.end_date}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data"

Connector configuration

The connector configuration establishes the foundational settings for your API integration.

Configuration fields

Field	Description	Required
`name`	Descriptive name for the connector.	Yes
`base_url`	Root URL for all API requests.	Yes
`default_headers`	Headers sent with every request.	No
`default_retry_strategy`	Retry policies for failed requests.	No
`variables_metadata`	Variable storage configuration.	Yes
`variables_storages`	Storage location definitions.	Yes

Base URL

The base URL is the root endpoint for your API. All step endpoints are appended to this URL.

connector:
  name: "Salesforce Connector"
  base_url: "https://mycompany.salesforce.com/services/data/v58.0"

Note: Do not include trailing slashes in the base URL.
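As a minimal sketch of how the base URL and a step endpoint fit together, assuming {{%BASE_URL%}} expands to the base_url value exactly as written, the step below would be expected to call https://api.example.com/v1/users. The connector name, host, and /users path are placeholders.

```yaml
connector:
  name: "Example Connector"                 # placeholder name
  base_url: "https://api.example.com/v1"    # no trailing slash
  # other connector fields omitted for brevity

steps:
  - name: "Get Users"
    type: "rest"
    http_method: "GET"
    # Assumed to resolve to https://api.example.com/v1/users
    endpoint: "{{%BASE_URL%}}/users"
```

With a trailing slash in base_url, the same endpoint would presumably contain a double slash (/v1//users), which some APIs reject.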

Default headers

Define the headers that are sent with every API request:

connector:
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
    X-Custom-Header: "custom-value"

Note: Do not include Authorization headers here. Authentication is injected automatically from the interface parameters.
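The split between authentication and default headers might look like the following sketch, which reuses the bearer-token parameter from the complete example earlier on this page. The connector name and URL are placeholders, and the exact header that the injected token produces is not described in this section.

```yaml
# Authentication is declared once as an interface parameter
interface_parameters:
  section:
    source:
      - name: "api_credentials"
        type: "authentication"
        auth_type: "bearer"
        fields:
          - name: "bearer_token"
            type: "string"
            is_encrypted: true

# default_headers carries only non-authentication headers
connector:
  name: "My API Connector"
  base_url: "https://api.example.com/v1"
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
    # no Authorization entry here
```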

Default retry strategy

Configure automatic retries for specific HTTP status codes. Each status code can have a unique maximum attempt count and interval.

Max attempts: The maximum number of times the connector retries the failed request.
Retry interval: The time, in seconds, that the connector waits between attempts.

connector:
  default_retry_strategy:
    429:                    # Rate Limited
      max_attempts: 5
      retry_interval: 60    # seconds
    500:                    # Internal Server Error
      max_attempts: 3
      retry_interval: 10
    502:                    # Bad Gateway
      max_attempts: 3
      retry_interval: 10
    503:                    # Service Unavailable
      max_attempts: 3
      retry_interval: 30
    504:                    # Gateway Timeout
      max_attempts: 3
      retry_interval: 10
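To estimate how long a failing request can hold up a run, multiply the two values. The sketch below assumes a fixed wait of retry_interval seconds before each attempt and no backoff; this section does not describe the timing in more detail, so treat the totals as rough upper bounds.

```yaml
connector:
  default_retry_strategy:
    429:
      max_attempts: 5
      retry_interval: 60    # up to about 5 x 60 = 300 seconds of waiting
    503:
      max_attempts: 3
      retry_interval: 30    # up to about 3 x 30 = 90 seconds of waiting
```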

Variables storage

The variables storage configuration defines how and where the connector stores extracted data during execution.

connector:
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
    intermediate_data:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"

Workflow steps

Steps define the execution logic of your connector. Steps execute sequentially. You can use data extracted from one step in subsequent steps.

Step types

The following table describes the available step types.

Type	Description
`rest`	Executes an HTTP request (GET, POST, PUT, PATCH, DELETE).
`loop`	Iterates over a data collection, executing nested steps for each element.

REST step

A REST step executes a single HTTP request.

steps:
  - name: "Get Users"
    description: "Fetch all users from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users"
    query_params:
      status: "active"
      limit: "100"
    headers:
      X-Request-ID: "unique-id"
    retry_strategy:
      500:
        max_attempts: 3
        retry_interval: 10
    variables_output:
      - response_location: "data"
        variable_name: "users_list"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data.users"

HTTP methods

The http_method determines the action the request performs on the resource.

Method	Description
`GET`	Retrieves data.
`POST`	Creates data or sends payloads.
`PUT`	Updates or replaces data.
`PATCH`	Applies partial updates to data.
`DELETE`	Removes data.

POST request with body

Use the body field to define the data payload when creating or updating a record.

steps:
  - name: "Create Record"
    description: "Create a new record via POST"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/records"
    headers:
      Content-Type: "application/json"
    body:
      name: "{{%record_name%}}"
      email: "{{%record_email%}}"
      status: "active"
    variables_output:
      - response_location: "data"
        variable_name: "created_record"
        variable_format: "json"
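The same body field can be sketched for an update request. The example below assumes a hypothetical /records/<id> endpoint and a record_id variable populated by an earlier step; it uses only the fields shown in the POST example.

```yaml
steps:
  - name: "Update Record Status"           # hypothetical step
    description: "Partially update an existing record via PATCH"
    type: "rest"
    http_method: "PATCH"
    # record_id is assumed to come from a previous step
    endpoint: "{{%BASE_URL%}}/records/{{%record_id%}}"
    headers:
      Content-Type: "application/json"
    body:
      status: "inactive"                   # only the changed field is sent
    variables_output:
      - response_location: "data"
        variable_name: "updated_record"    # hypothetical variable name
        variable_format: "json"
```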

Loop step

Loop steps iterate over collections of data and execute nested steps for each item.

steps:
  # Step 1: Fetch list of account IDs
  - name: "Get Account IDs"
    description: "Retrieve all account IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/accounts"
    variables_output:
      - response_location: "data"
        variable_name: "account_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.accounts[*].id"   # Extract just the IDs

  # Step 2: Loop through each account ID
  - name: "Process Each Account"
    description: "Fetch details for each account"
    type: "loop"
    loop:
      type: "data"
      variable_name: "account_ids"
      item_name: "account_id"               # Each item IS the ID
      add_to_results: true
      ignore_errors: false
    steps:
      - name: "Get Account Details"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/accounts/{{%account_id%}}/details"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Important: The item_name alias represents the entire current item in the iteration. To use a specific property, such as an ID, extract it in the transformation layer so that each loop item contains only the required value.

Loop configuration options

The following table describes the fields for configuring a loop.

Field	Description	Required
`type`	Specifies the loop type: `data`, `date_range`, or `while`	Yes
`variable_name`	Identifies the variable containing the array to iterate.	Yes
`item_name`	Sets an alias for the current item in the iteration.	Yes
`add_to_results`	Includes the loop output in the final results.	Yes
`ignore_errors`	Continues the loop if individual items fail.	No

External variables loop

When a loop is the first step in your workflow, you can iterate over data passed from the source River using the {ext.} syntax:

steps:
  - name: "Process External IDs"
    description: "Loop through IDs from source River"
    type: "loop"
    loop:
      type: "data"
      variable_name: "{ext.source_ids}"  # External variable syntax
      item_name: "item_id"
      add_to_results: true
      ignore_errors: true
    steps:
      - name: "Fetch Item"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/items/{{%item_id%}}"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Note: You can use the {ext.} prefix only in the first step of a workflow.

Accessing external dictionary properties

When an external variable is a dictionary (object), you can access its properties using dot notation.

steps:
  - name: "Fetch Using External Config"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users/{{%{ext.config.user_id}%}}"
    query_params:
      region: "{{%{ext.config.region}%}}"

Note: Dot notation for property access ({{%variable.property%}}) only works with external dictionary variables. For standard loop items, extract the specific values in the transformation layer.

Variable outputs and transformations

Use the variables_output list to specify how the connector extracts and stores response data.

Variable output configuration

variables_output:
  - response_location: "data"      # data, header, or status_code
    variable_name: "users_data"
    variable_format: "json"
    overwrite_storage: false       # Append (false) or replace (true)
    transformation_layers:
      - type: "extract_json"
        from_type: "json"
        json_path: "$.data.users[*]"

Response locations

The following table describes the available source locations for variables.

Location	Description
`data`	Extracts content from the response body.
`header`	Extracts values from the response headers.
`status_code`	Captures the numerical HTTP status code.
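A single step can declare more than one output, each reading from a different location. The sketch below stores the response body and also captures the status code. The variable name last_status_code is hypothetical, and the variable_format used for a non-body value is an assumption, since this section does not specify it.

```yaml
variables_output:
  # Response body, stored as the final output
  - response_location: "data"
    variable_name: "final_output_file"
    variable_format: "json"
  # Numerical HTTP status code of the same response
  - response_location: "status_code"
    variable_name: "last_status_code"   # hypothetical variable name
    variable_format: "json"             # assumption: format for non-body values
```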

Transformation layers

Apply transformation layers to modify or filter response data before storing it.

transformation_layers:
  - type: "extract_json"
    from_type: "json"
    json_path: "$.data.items[*]"

Supported transformations

The following table lists the available transformation types.

Type	Description
`extract_json`	Extracts specific data using JSONPath syntax.
`extract_csv`	Parses incoming CSV data into a usable format.
`to_json`	Converts the data into JSON format.
`to_csv`	Converts the data into CSV format.

Common JSONPath patterns

# Direct property
json_path: "$.data"

# Nested property
json_path: "$.data.users"

# All array items
json_path: "$.data.users[*]"

# Specific field from all items
json_path: "$.data.users[*].id"

# Root array
json_path: "$[*]"

# ID of the last item in the array
json_path: "$.data[-1].id"
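To make these patterns concrete, the sketch below pairs a small hypothetical response body (written as YAML for readability) with what each path selects under standard JSONPath semantics. How the connector stores multiple matches is not covered in this section.

```yaml
# Hypothetical response body
data:
  users:
    - id: 1
      name: "Ada"
    - id: 2
      name: "Lin"

# json_path               selects
# "$.data"                the object that contains the users array
# "$.data.users"          the users array as one value
# "$.data.users[*]"       each user object in turn
# "$.data.users[*].id"    1 and 2
```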

Variable reference syntax

Use the following syntax patterns to inject dynamic data into your configuration.

Context	Syntax	Description	Example
Internal variable	`{{%variable_name%}}`	References data from a previous step.	`/users/{{%user_id%}}`
Loop item	`{{%item_name%}}`	References the current loop item value.	`/orders/{{%order_id%}}`
Interface parameter	`{param_name}`	References a user input value.	`?status={status_filter}`
Date range start	`{param.start_date}`	References a start date from a picker.	`from={dates.start_date}`
Date range end	`{param.end_date}`	References an end date from a picker.	`to={dates.end_date}`
External data	`{ext.variable_name}`	References data from a source River.	`{ext.incoming_ids}`
External dict property	`{{%{ext.dict.property}%}}`	References a property from a dictionary.	`{{%{ext.config.user_id}%}}`
Base URL	`{{%BASE_URL%}}`	References the connector base URL.	`{{%BASE_URL%}}/users`
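Several of these syntaxes can appear in one step. The sketch below assumes a user_id variable set by an earlier step, a status_filter interface parameter, and a date_range parameter like the one in the complete example; the endpoint path and the query parameter names are placeholders.

```yaml
steps:
  - name: "Get User Orders"               # hypothetical step
    type: "rest"
    http_method: "GET"
    # Base URL plus an internal variable from a previous step
    endpoint: "{{%BASE_URL%}}/users/{{%user_id%}}/orders"
    query_params:
      status: "{status_filter}"           # interface parameter
      from: "{date_range.start_date}"     # date range start
      to: "{date_range.end_date}"         # date range end
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
```

Note the two delimiter styles: double braces with percent signs for values produced during the run, and single braces for values supplied by the user or the source River.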

Workflow patterns

The following patterns show common implementation strategies for connector workflows.

Pattern 1: Simple data fetch

A single REST step fetches all records, paging through the results until the response contains no more data:

steps:
  - name: "Fetch All Records"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/records"
    pagination:
      type: "page"
      location: "qs"
      parameters:
        - name: "page"
          value: 1
          increment_by: 1
        - name: "per_page"
          value: 100
      break_conditions:
        - name: "No More Data"
          condition:
            type: "empty_json_path"
            key_json_path: "$.data"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"

Pattern 2: Parent-child relationship

Fetch a list, then get details for each item:

steps:
  # Step 1: Get parent records
  - name: "Get Organization IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/organizations"
    variables_output:
      - response_location: "data"
        variable_name: "org_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.organizations[*].id"   # Extract just the IDs

  # Step 2: Loop through each and get child records
  - name: "Get Org Members"
    type: "loop"
    loop:
      type: "data"
      variable_name: "org_ids"
      item_name: "org_id"                        # Each item IS the org ID
      add_to_results: true
    steps:
      - name: "Fetch Members"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/organizations/{{%org_id%}}/members"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Pattern 3: Sequential API calls

Multiple REST steps in sequence, where a later step uses a value extracted by an earlier step:

steps:
  # Step 1: Authenticate and get token
  - name: "Get Access Token"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/auth/token"
    body:
      grant_type: "client_credentials"
    variables_output:
      - response_location: "data"
        variable_name: "auth_token"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.access_token"

  # Step 2: Use token to fetch data
  - name: "Fetch Protected Data"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/protected/data"
    headers:
      Authorization: "Bearer {{%auth_token%}}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"

Best practices

Naming conventions: Use clear and descriptive names.

Example

# Good
variable_name: "user_profiles"
variable_name: "order_transactions"
variable_name: "final_output_file"

# Avoid
variable_name: "data"
variable_name: "temp"
variable_name: "x"

Error handling: Configure retry strategies for common error codes.

Example

retry_strategy:
  429:
    max_attempts: 5
    retry_interval: 60
  500:
    max_attempts: 3
    retry_interval: 10

Pagination safety: Include break conditions in all paginated requests to prevent infinite loops.

Example

break_conditions:
  - name: "Primary: Empty Data"
    condition:
      type: "empty_json_path"
      key_json_path: "$.data"
  - name: "Safety: Page Size Check"
    condition:
      type: "page_size_break"
      page_size_param_name: "limit"
      items_json_path: "$.data"

Loop configuration:
- Set ignore_errors: true if the workflow must continue despite individual item failures.
- Set add_to_results: true to include loop outputs in the final result.
- Test workflows with small data sets before executing full production runs.

Security:
- Mark all sensitive fields with is_encrypted: true.
- Never hardcode credentials in YAML.
- Use interface parameters for all authentication.

Advanced features

The following advanced features are available for complex scenarios. For more information, refer to the YAML reference guide.

Feature	Description
PUT/PATCH/DELETE methods	Additional HTTP methods for data modification.
Date range loops	Iterates through date chunks.
While loops	Repeats an execution block until a specific condition is met.
Pre-run configuration	Executes setup steps before the main workflow starts.
Multi-report configuration	Generates multiple report outputs from one workflow.
Advanced break conditions	Evaluates string equality, numeric values, or compound OR logic.
CSV transformations	Parses and converts data between JSON and CSV formats.
RFC8288 Link header pagination	Uses standard Link headers to manage paginated API responses.